CMP L2 NUCA Cache Power Consumption Reduction Technique

Authors

  • P. Foglia
  • C. A. Prete
  • M. Solinas
  • F. Panicucci
Abstract

We analyze how applications use the banks of a large shared CMP L2 D-NUCA cache depending on their locality, and we define a power consumption model. We then develop a mechanism that dynamically turns bankclusters on and off in order to reduce energy consumption.

1. CMP Way Adaptable

Our system is a large shared L2 D-NUCA cache in a CMP environment (Figure 1). In this architecture each bank can be accessed independently, and data can migrate within the cache toward the CPU that uses it.

Figure 1. Cache Architecture

To perform our analysis, we first developed a power model to evaluate the total energy cost of this system, estimating both the static (leakage) and dynamic components of the energy consumption (Figure 2). We observed that leakage is the dominant source of power consumption.

Figure 2. Total Normalized Energy

We also analyzed how D-NUCA banks and bankclusters are accessed (Figure 3). We noticed that most hits often occur in the bankclusters local to the CPUs on which the applications run, whereas some of the other bankclusters receive fewer accesses.

Figure 3. Bankcluster Hits in One Run

Since our objective is to reduce the energy consumption of our system, we plan to develop a mechanism that dynamically turns bankclusters on and off based on the frequency of access. We count the number of accesses to each bankcluster within a run interval and compare this value with a reference value estimated beforehand. We then decide whether there is unused memory, so that a bankcluster can be turned off, or whether the system needs more cache memory, so that a bankcluster must be turned on. This can be considered an extension of the way adaptable technique we presented in [1] for monoprocessor systems.

By adopting this mechanism we aim to reduce energy consumption, because only the cache memory the application needs is used. We also aim to decrease both the network traffic for data search and the access delay to bankclusters, because there are fewer banks to visit during the data search and therefore fewer packets travelling through the network.
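
The interval-based decision described in the abstract (count accesses per bankcluster during a run interval, compare against a reference value, then turn a bankcluster on or off) can be illustrated with a minimal Python sketch. The abstract does not specify the exact policy, so everything here is an assumption made for illustration: the class name `BankclusterController`, the two thresholds `low_threshold` and `high_threshold` standing in for the reference value, and the rule of changing at most one bankcluster per interval.

```python
from dataclasses import dataclass, field


@dataclass
class BankclusterController:
    """Hypothetical interval-based on/off controller (not the authors' design)."""
    num_clusters: int
    low_threshold: int    # assumed: a cluster below this count is considered unused
    high_threshold: int   # assumed: all active clusters above this means more cache is needed
    min_active: int = 1   # assumed: never turn off the last active cluster
    active: list = field(init=False)
    counts: list = field(init=False)

    def __post_init__(self):
        self.active = [True] * self.num_clusters
        self.counts = [0] * self.num_clusters

    def record_access(self, cluster):
        # Only powered-on bankclusters can serve (and count) accesses.
        if self.active[cluster]:
            self.counts[cluster] += 1

    def end_interval(self):
        """Compare the interval's counts with the thresholds and adjust one cluster."""
        on = [i for i, a in enumerate(self.active) if a]
        off = [i for i, a in enumerate(self.active) if not a]
        coldest = min(on, key=lambda i: self.counts[i])
        if off and all(self.counts[i] > self.high_threshold for i in on):
            # Every active cluster is heavily used: turn one back on.
            self.active[off[0]] = True
            decision = ("on", off[0])
        elif len(on) > self.min_active and self.counts[coldest] < self.low_threshold:
            # The least-used active cluster is nearly idle: turn it off.
            self.active[coldest] = False
            decision = ("off", coldest)
        else:
            decision = ("keep", None)
        self.counts = [0] * self.num_clusters  # start a fresh interval
        return decision
```

In this sketch a caller would invoke `record_access` on every L2 access and `end_interval` once per run interval. A real design would also have to handle the data held in a bankcluster before powering it down, which the sketch leaves out.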

Similar resources

Way adaptable D-NUCA caches

Non-uniform cache architecture (NUCA) aims to limit the wire-delay problem typical of large on-chip last level caches: by partitioning a large cache into several banks, with the latency of each one depending on its physical location and by employing a scalable on-chip network to interconnect the banks with the cache controller, the average access latency can be reduced with respect to a traditi...

Techniques for reducing power consumption in CMP Nuca cache

Current trend of technology scaling makes it possible to put a huge number of transistors on a single die. While dynamic power consumption can benefit from technology scaling, static power consumption get worse, thus making the latter the dominant factor of power consumption in future microprocessors system. As on-chip cache memories require the most part of chip area and number of transistors,...

Techniques for reducing power consumption in CMP NUCA caches

Current trend of technology scaling makes it possible to put a huge number of transistors on a single die. While dynamic power consumption can benefit from technology scaling, static power consumption get worse, thus making the latter the dominant factor of power consumption in future microprocessor systems. As on-chip cache memories require the most part of chip area and number of transistors,...

Adaptive Zone-Aware Multi-bank on Chip last level L2 Cache Partitioning for Chip Multiprocessors

This paper proposes a novel efficient Non-Uniform Cache Architecture (NUCA) scheme for the Last-Level Cache (LLC) to reduce the average on-chip access latency and improve core isolation in Chip Multiprocessors (CMP). The architecture proposed is expected to improve upon the various NUCA schemes proposed so far such as S-NUCA, D-NUCA and SP-NUCA[9][10][5] in terms of average access latency witho...

Implementation Issues of Way Adaptable D-NUCA Caches

In a Way Adaptable D-NUCA cache the number of active ways is dynamically varied, according to the needs of the running application, resulting in a reduction of the static power consumption without affecting performance. Because of the need of constraining the power consumption when powering up of a way, in an actual implementation of a Way Adaptable D-NUCA cache, the new way becomes available w...

Publication date: 2008